Foundations of Data Science

Om

Section 1

Quarto

Quarto enables you to weave together content and executable code into a finished presentation.

To learn more about Quarto presentations, see https://quarto.org/docs/presentations/.

Bullets

When you click the Render button, a presentation will be generated that includes:

  • Content authored with markdown
  • Output from executable code

Code

When you click the Render button, a presentation will be generated that includes both content and the output of embedded code.

You can embed code like this:

1 + 1
[1] 2

Case studies

Case study 1: IMDb dataset

The Internet Movie Database (IMDb) is a popular website for film reviews.

  • Can we automatically identify the sentiment of a review?
  • Can we predict the IMDb rating of a film, based on its reviews?

There are also many practical problems:

  • How do we pre-process and clean the data?
  • How do we convert words into numbers?
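One common way to turn words into numbers is a bag-of-words count matrix: each review becomes a row, each distinct word a column, and each cell the number of times that word appears. The sketch below is only an illustration in plain Python; the function names `tokenize` and `bag_of_words` and the two example reviews are made up for this slide and are not part of the IMDb data or of any course material.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(reviews):
    """Return (vocabulary, count matrix) for a list of review strings."""
    tokenized = [tokenize(review) for review in reviews]
    # The vocabulary is the sorted set of all distinct words seen in any review.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing words count as 0
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical reviews, invented for illustration only.
reviews = ["A great film, great cast", "A dull film"]
vocabulary, matrix = bag_of_words(reviews)
print(vocabulary)
print(matrix)
```

Each row of `matrix` can then be used as the numeric input to a model. This representation ignores word order, which is one of the simplifications a real pipeline may need to revisit.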

Case study 2: sales price prediction

What is the right price for a house in Ames, Iowa?

In statistical terms: can you predict the final price based on the characteristics of the house?
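As a rough idea of what "predicting the price from the characteristics" can mean, here is a minimal sketch of least-squares linear regression with a single characteristic. Everything in it is an assumption made for illustration: the helper `fit_line`, the choice of living area as the only predictor, and the toy numbers, which are not taken from the Ames data.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data, invented for illustration: living area and price in arbitrary units.
area = [100.0, 150.0, 200.0]
price = [200.0, 300.0, 400.0]

intercept, slope = fit_line(area, price)
predicted = intercept + slope * 120.0  # predicted price for an unseen house
print(intercept, slope, predicted)
```

A real model would use many characteristics at once and would be evaluated on houses it has not seen during fitting.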

This is a Kaggle playground competition. You will be asked to join the competition along with thousands of other players!

Case study 3: the data challenge

The dataset will be disclosed on the kick-off date!

Your goal will be to build a model with good predictive performance.